An automatic pitch model with distance function
Authors
Abstract
Pitch modelling is an important factor in speech synthesis, where the pitch contour plays a demonstrable role in the intelligibility and naturalness of synthesised speech. While quantitative models for pitch contours have been proposed previously, each of these has a fixed level of detail, and not all of them provide a basis either for automatic extraction of pitch model parameters or for measuring the distance between two instances of a model. In this paper, a novel and compact quantitative model for pitch contours is presented which covers the possible variations in pitch and can be automatically extracted. The minimum F0 value, the level global slope of a pitch segment and the semi-periodic jitter properties are used as pitch components and are modelled with a linear function, a sine function and a set of sine functions, respectively. A distance measure is defined for the model which takes the shape of the contours into consideration. Experiments show a low mean square error (MSE) for the estimated contours for different languages across different corpora, and investigate the accuracy of the distance function on the model.
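The abstract does not give the model equations, so the following is only a minimal sketch of the general idea of describing a contour as a linear trend plus sinusoids and scoring the estimate by MSE. It assumes the sinusoid frequencies are already known and uses ordinary least squares on synthetic data; the function names (`fit_contour`, `predict`, `mse`) and all numeric values are hypothetical and are not the authors' extraction procedure or distance measure.

```python
import math

def design_row(t, freqs):
    """Basis values at time t: constant, slope, then sin/cos per frequency."""
    row = [1.0, t]
    for f in freqs:
        row.append(math.sin(2 * math.pi * f * t))
        row.append(math.cos(2 * math.pi * f * t))
    return row

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_contour(times, f0, freqs):
    """Least-squares coefficients via the normal equations (assumed approach)."""
    rows = [design_row(t, freqs) for t in times]
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    atb = [sum(r[i] * y for r, y in zip(rows, f0)) for i in range(k)]
    return solve(ata, atb)

def predict(coefs, times, freqs):
    """Reconstruct the contour from fitted coefficients."""
    return [sum(c * v for c, v in zip(coefs, design_row(t, freqs))) for t in times]

def mse(a, b):
    """Mean square error between two equally sampled contours."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

# Synthetic one-second contour (illustrative values only, not from the paper).
times = [i / 100 for i in range(100)]
freqs = [2.0, 7.0]  # assumed known; the paper extracts its parameters automatically
f0 = [120 - 15 * t
      + 8 * math.sin(2 * math.pi * 2.0 * t)
      + 1.5 * math.sin(2 * math.pi * 7.0 * t + 0.4)
      for t in times]

coefs = fit_contour(times, f0, freqs)
estimate = predict(coefs, times, freqs)
error = mse(f0, estimate)
```

Because the synthetic contour lies exactly in the span of the chosen basis, the fitted offset and slope recover the generating values and the error is close to zero; real F0 tracks would leave a residual.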
Similar resources
Dimensionality Reduction and Improving the Performance of Automatic Modulation Classification using Genetic Programming (RESEARCH NOTE)
This paper shows how genetic programming can be used to select suitable features for automatic modulation recognition. Automatic modulation recognition is one of the essential components of modern receivers. In this regard, the selection of suitable features may significantly affect the performance of the process. Simulations were conducted with SNRs of 5 dB and 10 dB. Test and ...
The Function of Pitch Range Variations in Samples of Emotional Expressions in Persian
This study aims at investigating the interface between emotion and intonation patterns (more specifically, duration and pitch amplitude of speech). To this end, the acoustic properties of spectral parameters related to speech prosody are investigated. The results of acoustic and statistical analysis show that the mean level and range of F0 in the contours vary strongly as a function of the degree o...
An Acoustic Study of Emotivity-Prosody Interface in Persian Speech Using the Tilt Model
This paper aims to explore some acoustic properties (i.e. duration and pitch amplitude of speech) associated with three different emotions: anger, sadness and joy against neutrality as a reference point, all being intentionally expressed by six Persian speakers. The primary purpose of this study is to find out if there is any correspondence between the given emotions and prosody patterning in P...
A Shift-Invariant Latent Variable Model for Automatic Music Transcription
In this work, a probabilistic model for multiple-instrument automatic music transcription is proposed. The model extends the shift-invariant probabilistic latent component analysis method, which is used for spectrogram factorization. Proposed extensions support the use of multiple spectral templates per pitch and per instrument source, as well as a time-varying pitch contribution for each sourc...
Type-2 fuzzy logic based pitch angle controller for fixed speed wind energy system
In this paper, an interval Type-2 fuzzy logic based pitch angle controller is proposed for a fixed speed wind energy system (WES) to maintain the aerodynamic power at its rated value. The pitch angle reference is generated by the proposed controller, which can compensate for the non-linear characteristics of the pitch angle with respect to the wind speed. The presence of the third dimension in the Type-2 fuzzy logic c...